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Abstract 

Given  a  point  sufficiently  close  to  a  nondegenerate  basic  feasible  solution  x* 
of  a  linear  program,  we  show  how  to  generate  a  sequence  {pfe}  that  converges 
to  the  0-1  vector  sign(x*)  at  a  Q-cubic  rate.  This  extremely  fast  convergence 
enables  us  to  determine,  with  a  high  degree  of  certainty,  which  variables  will  be 
zero  and  which  will  be  nonzero  at  optimality  and  then  construct  x*  from  this 
information. 
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1  Introduction 


Interior  point  algorithms  for  the  linear  program 

minimize  cTx, 

subject  to  Ax  =  b,  (1) 

x  >  0, 

where  c,  x  €  Rn,  b  G  Rm  and  A  €  RmXn  for  m  <  n,  do  not  search  through  the 
set  of  vertices  of  the  feasible  set  to  obtain  an  optimal  vertex  as  does  the  simplex 
method.  Instead,  a  primal  interior  point  algorithm  generates  a  sequence  of  interior 
(or  strictly  feasible)  points  {xfc}  that  converges  to  an  optimal  solution.  For  such  an 
interior  point  method,  the  construction  of  appropriate  stopping  criteria  is  often  a 
delicate  issue.  Additional  concerns  arise  when  a  highly  accurate  solution  is  required 
or  when  an  optimal  vertex  is  required  as  is  the  case  in  some  applications,  e.g.,  integer 
programming.  It  is  therefore  of  value  to  develop  an  iterative  procedure  with  a  high 
convergence  rate  that  can  locate  an  optimal  vertex  from  a  given  reasonably  good 
approximate  solution. 

In  this  paper,  we  present  and  analyze  a  new  method  for  locating  an  optimal  vertex 
x*  of  a  linear  program  from  a  given  approximate  solution.  Working  with  an  auxiliary 
problem,  this  method  generates  from  the  given  approximate  solution  a  sequence  that 
converges  to  the  0-1  vector  sign(x*).  Our  usage  of  the  sign  function  takes  zero  as  the 
sign  of  zero;  otherwise  it  is  the  standard  usage.  Using  the  information  provided  by 
this  0-1  vector,  x*  can  be  recovered  from  the  linear  constraints  Ax  =  b  by  setting  to 
zero  those  variables  that  have  been  identified  as  zero  and  then  solving  for  the  nonzero 
variables.  The  significance  of  this  approach  is  that  the  convergence  rate  of  this  new 
method  is  Q-cubic  if  the  optimal  vertex  is  nondegenerate. 
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In  anticipation  of  future  use,  we  state  the  dual  linear  program  of  (1) 

maximize  bTy 

subject  to  ATy  +  z  =  c,  (2) 

2  >  0. 

In  (1.2)  2  €  Rn  is  the  vector  of  dual  slack  variables.  We  will  conveniently  assume 
throughout  this  paper  that  the  matrix  A  has  full  rank  m. 

This  paper  is  organized  as  follows.  In  Section  2,  we  present  our  method  for  locating 
the  nondegenerate  optimal  vertex  from  an  approximate  primal  solution.  In  Section 
3  we  show  that  this  method  is  Q-cubically  convergent  and  also  briefly  discuss  its 
implementations  as  well  as  its  dual  variant.  In  Section  4  we  give  several  numerical 
examples  which  demonstrate  the  convergence  behavior  of  our  method.  Concluding 
remarks  are  given  in  the  final  section. 

2  The  Algorithm 

Recently,  Tapia  and  Zhang  [1]  proposed  a  technique  for  identify  optimal  bases  for  inte¬ 
rior  point  methods.  The  technique  was  motivated  by  the  observation  that  essentially 
every  interior  point  method  motivated  and  influenced  by  the  Karmarkar  algorithm 
[2]  involves  a  matrix  of  the  form  DAT  (AD2  AT)~*  AD,  where  D  is  a  diagonal  matrix 
that  changes  at  each  iteration.  In  primal  algorithms  (affine  variants  of  the  Karmarkar 
algorithm,  for  example,  see  [3],  [4]  and  [5]),  we  see  D  =  diag(x).  In  dual  algorithms 
(see  Adler  et  al.  [6]  and  Monma  and  Morton  [7],  for  example),  we  see  D  —  [diag(z)]~1 , 
where  2  is  the  vector  of  dual  slack  variables.  And  in  primal-dual  algorithms  (see  Ko- 
jima  et  al.  [8],  for  example),  we  see  D2  —  diag(x)[diag(z)]~1 .  In  [1],  Tapia  and  Zhang 
showed  that  under  suitable  assumptions,  optimal  basis  information  can  be  obtained 
from  the  diagonal  of  this  matrix  D AT (AD2 AT)~l  AD  which  they  called  the  indicator 
vector  or  simply  the  indicator.  Ye  and  Todd  (1987)  [9]  perhaps  were  the  first  to  point 
out  that  the  diagonal  elements  of  such  a  matrix  contain  valuable  information. 
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For  a  fixed  matrix  A  G  RmXn(m  <  n ),  consider  the  matrix-valued  function  H  : 
Rn  Rnx«  defined  by 

H(d)  =  DAT(AD2AT)+AD,  (3) 

where  d  G  R",  D  —  diag(d)  and  the  superscript  “+”  denotes  the  generalized  inverse. 
The  Tapia-Zhang  indicator  is  defined  as  the  function  q  :  R"  — >  R"  obtained  as  the 
diagonal  of  H(d),  i.e., 

q(d)  =  diag(H(d))  or  qi(d)  —  Ha(d),  i  =  1, 2, 3, n.  (4) 

This  function  q(d )  is  defined  for  all  d  G  R”;  however  it  will  not  be  continuous 
at  points  d  where  the  matrix  AD2AT  changes  rank  in  every  neighborhood  of  d.  At 
points  d  where  AD2AT  has  constant  rank  in  some  neighborhood  of  d,  q(d)  will  be 
infinitely  smooth.  It  is  not  hard  to  see  that 

0  <  q(d)  <  1 

for  all  d  G  Rn  because  both  H (d)  and  I  —  H(d)  are  orthogonal  projections  (satisfying 
PT  =  P  and  P 2  =  P )  and,  therefore,  are  positive  semi-definite  with  non-negative 
diagonals. 

We  now  define  a  new  function  u  :  Rn  ->  R"  as 

u(d )  =  q(Dd).  (5) 

where  D  =  diag(d).  It  is  evident  from  the  definition  of  the  function  q  that 

u(d)  =  diag(D2AT(AD4AT)+AD2). 

Our  method  for  locating  an  optimal  vertex  x*  of  a  linear  program  from  an  ap¬ 
proximate  primal  solution  x  can  be  described  as  follows. 

Algorithm  1  Given  an  approximate  primal  solution  x  of  the  linear  program  (1)  and 
a  small  number  e  >  0,  set  p°  —  x.  For  k  =  1,2,...,  until  the  criterion  (6)  (defined 
below)  is  satisfied ,  compute 

Pk  = 
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The  stopping  criterion  is  satisfied  when 

{i  :  Pi  >  1  -  e,  1  <  i  <  n}  |J{*  :  p*  <  e,  1  <  i  <  n}  =  {1, 2, n}.  (6) 

In  the  algorithm,  the  tolerance  £  is  allowed  to  be  zero  for  theoretical  purposes.  In 
practice,  e  will  be  chosen  as  a  small  positive  number. 

The  procedure  of  recovering  the  optimal  vertex  is  as  follows.  First  solve  the 
square  linear  system  A\w  =  b  for  to,  where  A\  is  the  submatrix  of  A  consisting  of  the 
columns  of  A  which  correspond  to  p1-  >  1  —  £  arranged  in  ascending  order  of  their 
column  indices.  Then  construct  the  optimal  vertex  x*  by  extending  w  to  an  n-vector 
by  inserting  zeros  in  the  positions  corresponding  to  p1-  <  e. 


3  The  Cubic  Convergence 


We  first  state  without  proof  two  lemmas  concerning  the  properties  of  the  function 
q(d)  that  we  will  use  in  this  section.  These  results  were  proved  in  [1]  and  interested 
readers  are  referred  to  that  paper. 


Lemma  3.1  For  d  6  Rn  let  q(d )  be  given  by  (4)  for  some  A  of  full  row  rank.  Con¬ 
sider  the  n-dimensional  vector  d*  where  some  components  of  d*  may  be  infinite.  As¬ 
sume  that  the  components  of  d*  can  be  divided  into  two  sets:  SQ  and  Sp,  such  that 


1.  Sa  contains  m  and  Sp  contains  n  —  m  components  of  d* ; 

2.  all  elements  in  Sa  are  non-zero  and 

max{]d*[  :  d*  6  Sp}  .  , 

min{|«£|  :  d*{  G  Sa) 

S.  the  m  by  m  submatrix  of  A  consisting  of  columns  corresponding  to  the  compo¬ 
nents  in  Sa  is  nonsingular. 


Then,  as  d  converges  to  d* 


Jim  qi(d)  =qi(d*) 


1,  if  d*  G  Sa, 
0,  ifd*  €  Sp. 


(8) 
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Since  d*  may  have  infinite  components,  we  define  the  derivatives  of  q  at  d*  by 
continuity,  for  example,  for  d  £  Rn, 

Vqfd*)  =  lim  V?i(d),  i  —  1,2 

d — *-d* 

Lemma  3.2  Let  q(d)  be  given  as  in  (4)  and  d *  satisfy  the  conditions  of  Lemma  3.1. 
Then  q(d)  is  at  least  twice  continuously  differentiable  at  d* .  Moreover,  the  Jacobian 
matrix  of  q(d)  vanishes  at  d* ;  i.e., 

Vq{(d*)  =  0,  i  —  1,2, ...,  n.  (9) 

It  should  not  be  a  surprise  that  the  Jacobian  matrix  of  q(d)  is  zero  at  d*.  According 
to  Lemma  3.1  either  the  maximum  (qi(d*)  =  1)  or  the  minimum  ( qfd *)  =  0)  is  reached 
at  d*  for  every  component  of  q(d). 

An  immediate  consequence  of  Lemma  3.2  is  that  whenever  V2^(d)  exists  and  is 
continuous  at  d* ,  then 

\\q(d)-q*\\<0(\\d-d*f).  (10) 

We  now  need  the  following  two  technical  results  in  order  to  prove  our  main  con¬ 
vergence  result. 

Lemma  3.3  Let  d*  £  R"  be  finite  and  satisfy  the  three  conditions  in  Lemma  3.1. 
Then  there  exist  positive  constants  8q  and  Co  such  that  for  all  d  £  Rn  and  \\d— d*\\  <  8o 

M<0-p*II  <  Colli  -<rf,  (11) 

where  p*  is  a  zero-one  vector  of  m  ones  and  n  —  m  zeros  and  p*  =  sign{\d*\). 

Proof:  Since  d*  is  finite  and  satisfies  the  three  conditions  in  Lemma  3.1  with  all 
nonzero  elements  in  Sa  and  zeros  in  Sp,  so  is  D*d*  where  D*  =  diag(d*).  Therefore, 
by  Lemma  3.1  u(d)  =  q{Dd )  converges  to  p*  as  d  converges  to  d*.  Also  u(d)  is 
infinitely  differentiable  at  D*d* . 
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A  direct  calculation  shows  (see  [1]  for  details)  that 


2 Pe((d*),  if  i  =  j  =  i  and  d\  =  0, 

—2 (d}Pie(d*))2 ,  if  i  =  j  ^  l,  d*t  0  and  d*  —  0, 

0  otherwise, 

where  Pij(d)  is  the  (i,  j)-th  element  of  the  matrix  AT (AD2  AT)~1  A.  That  is,  if  d*t  =  0, 
the  Hessian  of  qe(d)  at  d*  has  only  a  single  non-zero  element  in  the  (£,£)  position; 
and  if  d*t  ^  0,  then  the  Hessian  of  qe(d)  at  d*  has  only  non-zeros  on  the  diagonal 
in  the  positions  corresponding  to  zero  d*’ s.  Therefore,  if  we  expand  q(Dd)  at  D*d* 
using  a  Taylor  expansion  and  note  that  its  first  derivative  is  zero  at  D*d *  (as  proved 
in  Lemma  3.2),  we  have 

N<0  -  P*)ll  =  h(Dd)  -  P-) ||  <  0(\\Dkzdkz\\2)  +  0(\\Dkdk  -  D'd* ||3), 

where  dk  €  Rn_m  consists  of  the  components  of  dk  that  converge  to  zero  and  Dk  = 
diag(dk).  From  the  fact  that  ( Dkdk)i  =  (df)2,  one  can  easily  verify  that 

\\Dkdk\\2  <\\dk -d*\\4  and  \\Dkdk  —  Z>*d*||3  =  0(\\dk  —  d*||3). 

The  existence  of  constants  60  and  Co  such  that  (11)  holds  follows  in  a  standard  fashion 
from  the  continuity  of  the  third  derivative  of  q(Dd)  at  D*d*.  This  completes  the  proof. 
□ 

Corollary  3.1  Let  p*  €  R”  be  a  zero-one  vector  with  m  ones  and  n  —  m  zeros.  If  the 
m  columns  of  the  matrix  A  corresponding  to  the  ones  in  p*  are  linearly  independent, 
then  there  exist  positive  constants  6  and  C  such  that  for  all  p  G  Rn  and  ||p  —  p*||  <  8 

IWp)  -p*II  <  C'IIp  —  P*||3-  (12) 

Proof:  The  proof  follows  directly  from  Lemma  3.3  because  p*  is  finite  and  satisfies 
the  three  conditions  in  Lemma  3.1.  □ 

Now  we  are  ready  to  state  and  prove  the  Q-cubic  convergence  result  for  Algo¬ 
rithm  1. 


d2qi{d*) 

ddiddj 
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Theorem  3.1  Let  {pk}  be  generated  by  Algorithm  1  with  e  —  0.  If  x  is  sufficiently 
close  to  a  nondegenerate  optimal  vertex  x*  of  the  linear  program  (1),  then  {pk}  — »  p * 
Q-cubically,  where  p*  =  sign(x*)  is  a  zero-one  vector  of  m  ones  and  n  —  m  zeros. 

Proof:  Choose  x  such  that  ||x  —  x*]|  <  min(<S0, 6)  and  p  =  max(Co, C) ||a;  —  £*||2  <  1, 
where  60,Co  and  6,  C  are  positive  constants  satisfying  (11)  (with  d  replaced  by  x) 
and  (12),  respectively.  The  proof  of  convergence  now  follows  by  induction.  Let 
e0  =  || cu  —  a:*||  and  e*,  =  ||pfc  —  p*\\  for  k  >  0.  By  Lemma  3.3  (with  d  replaced  by  x) 
and  our  assumptions, 

ei  =  ||u(p°)  -  p* ||  <  (7o||a:  -  ar*||3  <  pe0  <  e0. 

So  ei  <  6.  By  Corollary  3.1  and  our  assumptions, 

e2  =  IKp1)  —  p*||  <  CWp1  -  q* ||3  <  pex  <  ex. 

The  induction  step  can  now  proceed  in  the  same  manner.  This  establishes  the  Q-linear 
convergence.  The  Q-cubic  convergence  follows  directly  from  (12).  □ 

Numerical  implementation  of  Algorithm  1  can  be  carried  out  basically  in  two 
ways.  First,  if  an  orthonormal  basis  Q(d)  of  D2AT  is  computed  by  a  QR  method  or 
an  SVD  method,  then  obviously 

Ui(d)  =  Qi(d)TQi(d),  i  =  l,2,...,n,  (13) 

where  Qi(d )  is  the  i-th  row  of  Q(d).  On  the  other  hand,  if  a  Cholesky  factor  L(d)  of 
ADaAt  is  computed,  the  triangular  system  L(d)Q(d)  =  AD 2  needs  to  be  solved  to 
obtain  Q(d).  Then  u(d)  can  be  computed  from  (13). 

In  the  case  that  the  dual  linear  program  (2)  is  being  solved,  a  dual  version  of 
Algorithm  1  can  be  devised  by  letting  p°  =  1/z,-,  i  =  1,2, ...  ,n,  where  z  is  a  given 
approximation  to  an  optimal  dual  slack  vector  z*.  It  can  be  easily  shown  that  this 
dual  version  has  at  least  Q-cpmclratic  convergence  rate  for  nondegenerate  problems 
because  of  the  estimate 

\\q(d(z))  -  q*\\  <  0(\\z  -  z*f) 
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proved  in  [1],  where  di(z)  =  l/z,,  i  =  1, 2, . . . ,  n.  Also  such  generated  sequence  {pk} 
has  the  same  limit  p*  =  sign(x*). 

4  Numerical  Examples 

We  have  tested  our  method  using  randomly  generated  problems.  All  our  random 
numbers  are  uniformly  distributed  on  the  entire  real  line.  For  each  problem,  we 
generate  an  m  by  n  matrix  A  and  a  vertex  x*.  For  simplicity,  we  make  the  first  m 
elements  of  x*  nonzero  and  the  rest  zero.  Correspondingly,  we  make  sure  that  the 
first  m  columns  of  A  are  linearly  independent.  Also  we  generate  a  perturbation  vector 
h  and  let  the  initial  vector  x  for  Algorithm  1  be  x  =  x*  +  r/j/||/t||2.  Notice  that  the 
parameter  r  is  equal  to  the  quantity  ||a;  —  rc*|j2.  We  tested  Algorithm  1  on  a  Sun  3/50 
Workstation  with  a  machine  epsilon  of  about  2.22  x  10~16.  Numerical  results  for  four 
such  randomly  generated  problems  with  various  dimensions  are  tabulated  in  Table  1. 
In  this  table,  the  errors  \\pk  —  p*  1 1 2 ,  k  =  1,2, 3, 4,  are  given  for  each  problem,  where 
k  is  the  iteration  count. 


Table  1:  Numerical  results  for  4  random  problems 


Errors 

n  =  10,  to  =  5 

n  =  20,  m  =  10 

n  —  40,  m  =  20 

s 

II 

oo 

o 

3 

II 

o 

II*- *11 

6.00e-02 

7.00e-02 

6.00e-02 

8.00e-02 

lb1  -p*II 

3.56e-01 

5.48e-02 

1.30e-02 

2.63e-02 

lb2  -  pH 

7.91e-03 

7.20e-03 

1.48e-04 

1.45e-04 

lb3-p*ll 

2.89e-09 

6.34e-06 

7.78e-12 

1.80e-12 

lb4  -p*II 

9.16e-16 

9.93e-16 

1.49e-15 

3.08e-15 

It  can  be  seen  from  the  table  that  the  convergence  rate  of  Algorithm  1  was  indeed 
Q-cubic  (up  to  the  point  where  machine  epsilon  was  reached).  We  also  observed 
(which  is  no  surprise)  cases  where  x  was  not  close  to  x*,  and  {pk}  converged  to  a 
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zero-one  vector  corresponding  to  a  nearby  vertex. 


5  Concluding  Remarks 

Working  with  an  auxiliary  problem,  we  have  shown  that  the  information  needed 

to  construct  the  nondegenerate  optimal  vertex  of  a  linear  program  can  be  obtained 

iteratively  from  a  given  good  approximate  solution  at  a  Q- cubic  convergence  rate. 

While  this  result  certainly  has  theoretical  value,  considerably  more  research  is  needed 

in  order  to  determine  its  practical  value. 
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